A splice site mutation in the FvePHP gene is associated with leaf development and flowering time in woodland strawberry

Abstract Leaves and flowers are crucial for the growth and development of higher plants. In this study we identified a mutant with narrow leaflets and early flowering (nlef) in an ethyl methanesulfonate-mutagenized population of woodland strawberry (Fragaria vesca) and aimed to identify the candidate gene. Genetic analysis revealed that a single recessive gene, nlef, controlled the mutant phenotype. We found that FvH4_1g25470, which encodes a putative DNA polymerase α with a polymerase and histidinol phosphatase domain (PHP), might be the candidate gene, using bulked segregant analysis with whole-genome sequencing, molecular markers, and cloning analyses. A splice donor site mutation (C to T) at the 5′ end of the second intron led to an erroneous splice event that reduced the expression level of the full-length transcript of FvePHP in mutant plants. FvePHP was localized in the nucleus and was highly expressed in leaves. Silencing of FvePHP using the virus-induced gene silencing method resulted in partial developmental defects in strawberry leaves. Overexpression of the FvePHP gene can largely restore the mutant phenotype. The expression levels of FveSEP1, FveSEP3, FveAP1, FveFUL, and FveFT were higher in the mutants than those in ‘Yellow Wonder’ plants, probably contributing to the early flowering phenotype in mutant plants. Our results indicate that mutation in FvePHP is associated with multiple developmental pathways. These results aid in understanding the role of DNA polymerase in strawberry development.


Introduction
Strawberry is a small fruit crop with high economic value and is widely cultivated around the world [1]. Flowering time is a crucial factor in strawberry production, and elucidating the molecular mechanism of strawberry f lowering can lay the theoretical foundation for breeding new cultivars. In this study we identified a mutant with narrow leaf lets and early f lowering (nlef ) phenotypes in woodland strawberry (Fragaria vesca).
Leaves are derived from the shoot apical meristem (SAM), which plays a crucial role in plant growth and development [2,3]. Leaves store energy and are involved in processes such as photosynthesis, gas exchange, plant architecture determination, thermoregulation, and environmental adaptation in higher plants [4]. Leaf characteristics such as leaf shape, leaf size, and leaf margin affect plant growth. Internal and external factors inf luence the process of determination of leaf characteristics during leaf development. The internal factors include hormones, genes, and small RNAs [5][6][7], and the external environmental factors include light, temperature, and humidity [8]. Several recent studies have focused on leaf shape in several crops, such as rice, maize, tomato, and wheat [9][10][11][12][13][14]. For instance, a study reported that tomato fruit quality and cultivar performance can be predicted based on leaf shape [15]. However, the relationship between leaf shape and agronomic traits still needs to be explored.
Flowers are also derived from the SAM and are crucial for plant reproduction. The yield of many crops is closely related to the f lowering time, and the regulation of f lowering time involves a complex signaling network inf luenced by both external factors, such as the photoperiod and vernalization, and internal factors, such as the gibberellin pathway and the autonomous f lowering pathway [16,17]. These pathways inf luence the expression of f lowering-related genes such as FLOWERING LOCUS T (FT), APETALA 1 (AP1), and LEAFY (LFY), and regulate f lowering [17]. In addition, epigenetic components such as chromatin remodeling and histone modifications are involved in f loral transition [18].
In the post-transcriptional process, alternative splicing produces multiple mRNAs from a single pre-mRNA, generating various protein sequences with various activities and new proteins [19]. In higher plants, alternative splicing plays an important role in the response to environmental stimuli and regulates plant development processes such as seed germination and plant f lowering [19,20]. Alternative splicing is classified into five types with differing patterns: intron-retained, exon skipping, alternative 5 /3 splice sites, alternative first/end exons, and mutually exclusive exons [21]. During the splicing process, the spliceosome splits the intron by identifying the splice sites and the sequence in the 5 region of the intron to splice the corresponding exon sequence. Mutations in the splice sites exert a major inf luence on splicing events. For example, an alternative splice site mutation in the third subunit of a DNA polymerase accelerates f lowering in Brachypodium distachyon by promoting the expression of f loweringrelated genes [22]. In Arabidopsis thaliana, two DNA polymerases, AtPOLH and AtREV1, play a key role in DNA replication and are regulated by alternative splicing [23]. In cucumber, exon skipping in CsSEP2 affects f loral organs and fruit development [24].
In this study, using bulked segregant analysis (BSA) with wholegenome sequencing, we identified an SNP mutation that located the splice site of the FvePHP gene and is associated with leaf shape and f lowering regulation in woodland strawberry.

Phenotypic characterization of the nlef mutant
Early-f lowering mutants are key resources that facilitate the study of f lowering mechanisms. In this study we showed that an nlef mutant in an ethyl methanesulfonate-mutagenized M 2 population of woodland strawberry accession 'Yellow Wonder' (YW) (Fig. 1). Overall plant size, as measured by diameter and height, are decreased in the nlef mutant. The leaf let length, leaf let width, and the length of petioles, number of serrations, and leaf let area are also decreased in the mutant. The ration of leaf let length to leaf let width is increased in the mutant, which indicates that the leaf lets of the mutant are narrow (Supplementary Data Table S4, Fig. 2a-g). These data suggest that the overall plant size of the mutant is smaller than that of YW.
To observe and record the f lowering of nlef plants, YW and nlef plants were planted under strict chamber conditions. As Fig. 2i shows, the mutant strawberry nlef bloomed earlier than YW. On the 64th day after planting (DAP) 100% of mutant strawberry plants had bloomed, whereas at this time 42% of the YW had bloomed. Thus, an early-f lowering phenotype was also observed in the mutant, besides the obvious defects in both leaf shape and morphological structure.

Leaf epidermal cells of 'Yellow Wonder' and nlef mutant were observed by scanning electron microscopy
Since cell size and division play crucial roles in plant organ development, we observed epidermal cells of young leaves of YW and the nlef mutant using scanning electron microscopy. The results indicated that the abaxial epidermal cells of nlef mutant were significantly smaller than those of YW (Fig. 3).

Genetic analysis of nlef
To identify the candidate gene for the nlef phenotype, the nlef mutant as the female parent was hybridized with the woodland  strawberry accession 'Ruegen' (RG) to generate an F 1 generation. Then, an F 2 population was obtained by self-pollination of F 1 , which included 277 RG plants and 97 nlef plants. The ratio of RG plants to nlef plants was ∼3:1 (χ 2 = 0.128; χ 2 0.05 = 3.84), indicating that the nlef phenotype was controlled by a single recessive locus.

Identification of candidate region based on bulk segregant analysis sequencing
Through genome sequencing of RG, nlef mutant, and two DNA pools from 37 mutants and 37 RG plants of the F 2 population, we obtained 51 920 640, 19 635 770, 45 181 986, and 52 513 370 highquality reads, respectively. The Q30 quality scores for the clean  Table S2).
To narrow the candidate interval, we increased the number of F 2 individuals from other five F 1 plants. Six InDel markers were developed to test for their linkage to the mutant locus by genotyping 406 recessive individuals from the F 2 individuals from other five F 1 plants (Fig. 4b). Subsequently, the candidate region was delimited to the 1.94 Mb region between 16.30 and 18.24 Mb (Fig. 4b).

Identification of the candidate gene
In this study a total of 203 genes were annotated in the 1.94-Mb candidate region, and 3431 SNPs and 887 InDels were contained in this candidate region. According to the annotation of SNPs and InDels, ∼33% of SNPs/InDels were located upstream, ∼23% were downstream of the genes, ∼31% in the intergenic regions, ∼16% in the introns, and only ∼4.2% of the SNPs and ∼ 0.3% of the InDels were located in the coding sequence (CDS) regions (Supplementary Data Table S5). On further analysis, we found that five homozygous mutations were located upstream of the genes shown in Table 2 and two homozygous mutations were located downstream of the genes ( Table 2). In addition, three homozygous mutations were located in the introns of genes and one homozygous mutation was located in the intergenic region. None of these mutations changed the CDS. Two SNPs located in splice sites caught our attention ( Table 2). The splice site plays a crucial role in the resulting protein sequence. Therefore, we investigated whether the genes themselves give rise to the splicing event. RT-PCR was used to detect splicing events and no distinction was observed in FvH4_1g25130 between YW and the nlef mutant, while FvH4_1g25470 in YW and the nlef mutant had one and two bands in the gel electrophoresis, respectively ( Fig. 4d and e). In addition, RNA-seq of YW and the nlef mutant showed that FvH4_1g25470 was the only gene that displayed significantly differential expression between YW and the nlef mutant in the 1.94 Mb candidate regions [log 2 FC < −1, false discovery rate (FDR) < 0.05; Supplementary Data Table S3, Supplementary Data Fig. S1].
A dCAPs marker of FvH4_1g25470 was developed based on (pos 17,218,907 C-T,). The result showed that SNP 17,218,907 of FvH4_1g25470 co-segregated with the mutation phenotype in 6 recombinant individuals between the region Indel-9780 and Indel-9699 ( Fig. 4c).
To further investigate the mutation of the splice site, specific primers were designed with sizes between 1464 and 2112 bp for PCR amplification using DNA from the YW and mutant plants as templates. Sequencing of the PCR products indicated that the locus had a homozygous mutation (C in YW mutated to T in the mutant plants) (Fig. 5b). These results indicated that FvH4_1g25470 might be a candidate gene. Moreover, analysis of FvH4_1g25470 using the SMART website (http:// smart.embl-heidelberg.de/) showed that FvH4_1g25470 encodes a hypothetical protein with one DNA polymerase α domain and one polymerase and histidinol phosphatase (PHP) domain (Fig. 5d). Herein, FvH4_1g25470 is named FvePHP.

Cloning and bioinformatic analysis of FvePHP, FvePHP-IR, and FvePHP-PED
To identify the sequence differences, full-length CDSs from YW and the nlef mutant were amplified. The results showed one band in YW and two bands in the nlef mutant (Fig. 5a), and one band in the nlef mutant was larger than that in YW. Sequencing results showed two new transcripts in the nlef mutant due to mutations. One transcript contained one intron, suggesting an intron retention (IR) event in alternative splicing. In the other transcript, a part of an exon was found to be deleted (partial exon deletion, PED). Therefore, these transcripts were named FvePHP-IR and FvePHP-PED, respectively (Fig. 5b, Supplementary Data Fig. S2). According to genomic annotation information, FvePHP has five exons and four introns. Its full-length CDS has 1338 bp and encodes 445 amino acids and a termination codon. In mutant transcripts the full-length CDS and amino acids were altered. FvePHP-IR had a 456-bp CDS that encoded 151 amino acids and a termination codon in the second intron (Fig. 5c). FvePHP-PED had a 1281-bp CDS that encoded 427 amino acids and a termination codon, due to a 54-bp deletion in the second exon of FvePHP (Fig. 5c). These results suggest that the splice donor site mutation caused abnormal RNA splicing in nlef .

Expression pattern and subcellular localization of FvePHP
To explore the temporal and spatial expression of FvePHP in YW plants, we conducted reverse transcription PCR (RT-qPCR) analysis in samples from the roots, petioles, leaves, f lowers, small green fruits, ripening fruits, and shoot apices of YW. The . Location and identification of candidate genes. a ED value distribution of SNPs and InDels on chromosomes. The x-axis shows the chromosome name. Colored points represent the ED value. The black line represents the fitted ED value, and the red dotted line represents the significance correlation threshold. b InDel markers used to screen candidate genes. The horizontal line represents chromosome Fvb1 and the vertical line represents the location of the InDel marker. Two splice-site mutant genes are shown at the candidate interval. c The dCAPs marker was used to analyze the candidate gene FvH4_1g25470. PCR products were digested with SmaI and detected by 3% agarose gel electrophoresis (lanes 1-9). Control samples were not digested with SmaI. SNP 17218907 was C:C genotype in YW and RG plants, T:T genotype was in nlef and six recombinant individuals. results showed that the expression of FvePHP in YW leaves was higher than that in other tissues (Fig. 6a). The online website https://predictprotein.org/ [25] was used to predict the subcellular location of FvePHP, FvePHP-IR, and FvePHP-PED. Based on the prediction results, all three proteins, FvePHP, FvePHP-IR, and FvePHP-PED, were localized in the nucleus, with a prediction confidence score of 47, 72, and 60, respectively (Supplementary Data Fig. S2c). Furthermore, using the NLS_Mapper website (http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper) [26][27][28], the signal peptides of nuclear localization signals located in the N termini of proteins were successfully predicted in three proteins (Supplementary Data Fig. S2b). To verify the subcellular location of the FvePHP protein with confidence score of 47, we transiently expressed the 35S::FvePHP-GFP fusion gene in tobacco leaf epidermal cells. Confocal microscopy was used to detect the protein signals. As shown in Fig. 6b, the fusion protein signal of FvePHP-GFP was colocalized with 6-diamidino-2-phenylindole in the nucleus of the epidermal cells of tobacco leaves, while the signal of 35S::GFP was distributed in the nucleus and the cytoplasm. All the above results indicated that FvePHP is expressed in all tissues and implies that its protein functions in the nucleus. Therefore, we speculate that FvePHP-IR and FvePHP-PED with higher confidence scores are also localized in the nucleus. Table 2. Location, annotation, and nucleotide types of SNPs and InDels in different samples.  with RG genotype plants, respectively. The numbers in columns headed nlef, nlef pool, RG, and RG pool represent read depths. 'Effect' represents genome location information of variant nucleotides or sequences. N was no read support. 'NR_annotation' represents genes that were annotated using the NCBI database.

Virus-induced gene silencing in 'Yellow Wonder' plants
To confirm the hypothesis that the low expression level of FvePHP might be a major cause for the developmental defects in the nlef mutant, virus-induced gene silencing (VIGS) assays were performed in the YW plants. Tobacco rattle virus (TRV) vector was used to carry out this experiment. Agrobacterium tumefaciens GV3101 carrying TRV2-FvePHP and TRV1 was injected into 1month-old leaves of YW plants, and a mixture of TRV2 and TRV1 was used as a control. Three transformed plants were obtained with an obvious phenotype 3 months after infection by A. tumefaciens (Fig. 7a). As shown in Fig. 7a, the plant height of TRV2-FvePHP #1, #3, and #5 plants was lower than that of the control, and the number of serrations in the leaf margin was not significantly different ( Fig. 7b and c). The leaves of the three transformed plants showed abnormal development, as shown in Fig. 7b. RT-qPCR results shown that expression levels of FvePHP were significantly reduced in silenced plants (Fig. 7d). TRV2-FvePHP lines did not display the same phenotype as mutant plants, which could be attributed to the low efficiency of VIGS in strawberries.

Overexpression of FvePHP in nlef mutant recovered leaf phenotype and delayed flowering
To further confirm the phenotypes of nlef caused by a mutation of FvePHP, a complementation test was conducted. The fulllength CDS of FvePHP was cloned into pRI101-AN (35S::FvePHP), and then transformed into the nlef mutant by Agrobacteriummediated stable transformation with leaf pieces as explants. Two transgenic lines were obtained and identified by PCR (Supplementary Data Fig. S4). These transgenic lines on media have a similar phenotype, and they completely rescued the mutant defective in leaf serration ( Fig. 8c and d).
In order to further observe the phenotype, two transgenic lines were transplanted in a chamber environment. To analyze the normal transcript level of FvePHP, specific primers from the f lanks of the side of the second intron were designed (Supplementary Data Table S1). The relative expression level of FvePHP was analyzed using RT-qPCR in transgenic lines, YW and the nlef mutant. The transcript level of FvePHP was significantly increased in transgenic Lines 3 and 13 compared with YW and the nlef mutant (Fig. 8f). We further measured morphological traits, comprising plant height, leaf let length, leaf let width, and the serration number of the leaf let margin (Fig. 8h-k). Although plant height, leaf let width, and serration number of transgenic plants were significantly lower than those of YW, these indexes were significantly higher than those of the mutant. In addition, mutant plants started f lowering at 55 days in the chamber, whereas the wildtype and FvePHP-overexpressing plants started f lowering at 66 days (Fig. 8g, Supplementary Data Fig. S6). Further, the expression level of key f lowering-time genes was analyzed ( Supplementary Data Fig. S7). The expression of FveAP1 was reduced in transgenic lines overexpressing FvePHP in the mutant compared with YW (Supplementary Data Fig. S7a and b). In addition, the expression of FveTFL1 was lower in the transgenic plants compared with YW. The expression of FveSOC1 was increased in transgenic Line 13 and similar to that in YW transgenic Line 3 (Supplementary Data Fig. S7c and d). These data showed no great differences in the expression of FveAP1, FveTFL1, and FveSOC1 between transgenic plants and YW (Supplementary Data Fig. S7). The above results demonstrated that transgenic plants largely rescued the phenotypes of nlef plants.

Expression analysis of flowering-related genes
To better explore the reason for early f lowering in the nlef mutant, RT-qPCR was used to investigate several key f lowering-related genes [29] in YW and the nlef mutant. Higher expression of positive regulators of f lowering , FveSEP1, FveSEP3, FveAP1, FveFUL, and FveFT, was found in the mutant (Fig. 9a-e). The expression levels of FveAGL42, FveLFY, and FveSOC1 were similar to that of YW ( Fig. 9g-i). TERMINAL FLOWER1 (TFL1) is a f lowering repressor in seasonal f lowering accessions and FveTFL1 in YW is not functional [30,31]. Surprisingly, we found that the mRNA expression level was decreased in the mutant compared with YW (Fig. 9f). These results suggest that several f lowering-related genes are highly expressed, leading to early f lowering phenotype in the nlef mutant.

Discussion
In this study we identified a mutant, nlef , with narrow leaf lets and an early-f lowering phenotype in woodland strawberry (Fig. 1).
We have provided evidence to support FvePHP as a candidate gene that encodes a putative α DNA polymerase. The phenotypes of nlef were similar to those of A. thaliana DNA polymerase α mutants icu2 and esd7 [32,33], especially in the early f lowering phenotype. In the 1.9-Mb region of Fvb1, only FvePHP showed a splicing event caused by a mutation of the 5 splice site (Fig. 4e,  Supplementary Data Fig. S1). The dCAPs marker of this SNP showed co-segregation with the mutant (Fig. 4c). Furthermore, the VIGS experiment indicated that low expression of FvePHP caused leaf development defects. The complementation assay revealed that overexpressing FvePHP in the mutant can restore the leaf phenotype and f lowering. Overall, these results indicate that FvePHP is the candidate gene in the nlef mutant.
Cloning and sequencing of full-length cDNA indicated that the mutation at the splice site at the 5 end of the second intron led to a mis-splicing event affecting the PHP domain of the polymerase. Intron retention would lead to more than loss of function of the PHP domain. In the spliceosome, the U1 snRNP is expected to recognize the 5 splice site of the intron, and recruit the U2 snRNP to cut out the intron [32,33]. In FvePHP-IR, U1 could not recognize the mutated 5 splice site, and the intron is retained, which results in the premature stop codon and loss of the 3 end of the protein.
Meanwhile, exon deletion also led to protein function loss of FvePHP. In the spliceosome, U1 snRNP could recognize new 5 splice site, and recruit the U2 snRNP to cut the sequence between two splice sites [32,33]. In FvePHP-PED, U1 snRNP recognized new 5 splice site because of the original splice site mutation. This resulted in a 54-bp deletion in the CDS. Therefore, the splice site mutation results in mis-splicing and generates new transcripts. In Escherichia coli, DNA polymerase III contains a PHP domain to maintain its stability and activity, and its mutations led to decreased DNA polymerase III activity [34]. In Thermus thermophilus the PHP domain ensures the activity of family X DNA polymerases in base excision repair, and mutations in the PHP domain decrease the repair ability of these polymerases [35]. Moreover, the PHP domain of family X DNA polymerases exhibits 3 -5 exonuclease activity. Therefore, the conserved PHP domain plays an important role in maintaining DNA polymerase activity and stability [34]. In this study the PHP domain of FvePHP-IR was disrupted, while FvePHP-PED was a relatively complete protein that retained the active sites (Fig. 5c). These putative activation sites are highly conserved and are involved in metal chelation (Supplementary Data Fig. S3) [36]. Active sites of polymerase can coordinate the catalytic metal to ensure the integrity of the polymerase activity [34]. In FvePHP-PED, although the functionally important active sites of catalytic metals were present, the PHP domain was seriously damaged and we speculated that FvePHP-PED might have weak activity (Supplementary Data Fig. S3). In FvePHP-IR, not only the active sites but also the PHP domain was seriously damaged. In brief, alternative splicing generated new proteins that had non-functional or low functional activity in nlef .
Eukaryotic polymerases involved in the cell cycle process affect plant growth and development [37,38]. In A. thaliana, mutation in eukaryotic polymerase α/ICU2 inf luences the development of the SAM and the cell cycle [39]. It is known that the leaves originate from the SAM and leaf growth and development are closely inf luenced by the cell cycle [40,41]. In this study, the leaf epidermal cells were observed by scanning electron microscopy, which showed that the nlef mutant had significantly reduced cell size.
Eukaryotic polymerase plays a key role in DNA replication [42], DNA repair, genomic stability, plant growth, and developmental processes, transcriptional gene silencing [43], and epigenetic control [44], and ensures the stability of histone modification [45]. Reactive oxygen species (ROS) can be produced when DNA replication is blocked [46]. ROS are considered to be signaling molecules in response to stress [47], not only affecting cell proliferation and size, but also promoting plant f lowering through promoting expression of the f lowering genes FT1 and FT2 in litchi leaves [48,49]. Besides, eukaryotic polymerase plays a key role in maintaining genomic stability; for example, Arabidopsis polymerase α ensures the stable maintenance of repressive histone modifications, and f lowering genes FT and AG are upregulated because of loss of H3K27me3 in the DNA polymerase α mutant [45]. Eukaryotic polymerase δ can activate the epigenetic marks resulting in upregulation of the expression of FT and other f lowering-related genes [39,50,51]. In B. distachyon, mutation of POLD3 causes the enrichment of the chromatin mark H3K4me3 in the f lowering genes VRN1, SEP3, and AG, which leads to high expression of these f lowering genes [22].
In this study the nlef mutant showed an early-f lowering phenotype, and we found that the expression levels of FveSEP1, FveSEP3, FveAP1, FveFUL, and FveFT were significantly higher in the nlef mutant compared with YW. These results indicated that the FvePHP mutation activated the expression of f lower-related genes, leading to an early-f lowering phenotype. The relationship between FvePHP and histone modification of f lowering genes in strawberry is unclear, so it needed to be explored further.

Conclusions
In this study we identified a DNA polymerase gene (FvePHP) with a mutation at a splice site that produced two different transcripts, which led to the loss of function of FvePHP. On one hand, the FvePHP mutation caused multiple developmental defects, such as reduced plant size and leaf area. On the other hand, the mutation promoted the expression of several f lowering-related genes leading to early f lowering in strawberries.

Plant materials and cultivation environments
Diploid woodland strawberry (F. vesca) 'Yellow Wonder' (YW) and 'Ruegen' (RG) were used in this study. The wild-type RG strawberry has red fruit and a red petiole and YW, with white fruit, is a natural FvMYB10 gene mutant of RG. Mutant nlef was obtained from an ethyl methanesulfonate-mutated YW population. Selfing of the nlef generation was used to obtain the M 2 generation. The nlef mutant plant was identified in the M 2 generation and was crossed as the female parent to RG as the male parent. Selfpollination of F 1 -generation plants produced F 2 populations. The phenotypes of the progeny from selfed nlef plants and the F 2 population are shown in Supplementary Data Fig. S5. All plant materials were grown in a greenhouse at Shenyang Agricultural University in September. Seedling morphology was determined between 16 September and 15 October in a greenhouse at 26 • C under 12-hours light/12-hours dark conditions. Plant height, plant diameter, petiole length, leaf let length, and leaf let width were measured with a straightedge. The leaf let area was measured using ImageJ software. The third newly fully opened leaves were selected to measure the leaf shape traits.
Wild-type YW and mutant nlef seedlings were used for scanning electron microscopy and cultured in Murashige and Skoog (MS) medium at 23 • C under long-day (16 hours light/8 hours dark) conditions. Nicotiana benthamiana was used for transient

Scanning electron microscope analysis
Young leaves (3 mm × 5 mm) were fixed in 2.5% glutaraldehyde solution at 4 • C for 48 hours according to a protocol by Lin et al. [52] The samples were then observed under a scanning electron microscope (Regulus 8100, Hitachi, Japan) at the Analytical and Testing Center, Shenyang Agricultural University, Shenyang. Cell area was determined with ImageJ software.

Identification and analysis of SNPs/InDels
A method using Euclidean distance (ED) [57] was employed to assess the association of these SNPs and InDels. The ED value was calculated as follows: where the four letters A, G, C, and T represent the frequency of the corresponding DNA nucleotides in the RG pool and nlef pool. The ED value of non-target sites should tend to 0. The higher the ED value, the more closely the variant is related to target sites.

Primer design of InDel and dCAPs markers
Primer 5 software [58] was used to design the InDel marker primers. The primers for the dCAPs markers were designed using dCAPs Finder 2.0 (http://helix.wustl.edu/dcaps/dcaps.html). The primers used are shown in Supplementary Data Table S1.

RNA extraction and gene expression analysis
Total RNA was extracted using the CTAB method [53]. RNA (1 μg) was reverse-transcribed in a 20-μl reaction using the PrimeScript™ RT Reagent Kit with gDNA Eraser (RR047a, TaKaRa, Otsu, Japan). The reaction was performed according to the manufacturer's instructions. The cDNA obtained by reverse transcription was diluted five times and used in a real-time PCR experiment in an ABI 7500 system (Applied Biosystems, Foster City, CA, USA) performed with 10 μl of the reaction mix, which included 0.5 μl cDNA, 5 μl UltraSYBR Green Mixture reagent (CoWin Biotech, Beijing), 0.4 μl primer, and 4.1 μl ddH 2 O, and the relative expression levels were calculated using the 2 − ct method [59].

Gene cloning and structure analysis
Full-length CDSs of YW and nlef were amplified using the primers listed in Supplementary Data Table S1. The PCR products were detected by 1.5% agarose gel electrophoresis and purified using a gel extraction kit (TIANGEN Biotech, Beijing). The PCR products were ligated into the pEASYT1 vector, and Sanger sequencing was performed. The peak values of the sequencing sites were observed using DNASTAR. Sequence alignment was performed using DNAMAN V6 (Lynnon BioSoft, Canada).

Subcellular localization
The full-length CDS of FvePHP without the termination codon was amplified using specific primers with restriction sites (BamHI and KpnI) and then ligated to the pRI101-GFP vector by digestion. The pRI101-FvePHP-GFP and pRI101-GFP vectors were introduced into tobacco leaves via Agrobacterium-mediated transformation. The GFP f luorescence signal was detected and imaged using a laser confocal f luorescence microscope (TCS SP8-SE; Leica, Wetzlar, Germany). The DNA dye 4 ,6-diamidino-2-phenylindole was used to label the cells to visualize the nucleus.

Virus-induced gene silencing in wild-type plants
To verify the function of FvePHP, a 501-bp cDNA fragment of FvePHP was cloned into the BamHI and SacI sites of the pTRV2 vector. The method was based on a protocol by Lu et al. [60] with slight modifications as follows. The recombinant vector was named pTRV2-FvePHP. The plasmids pTRV1, pTRV2, and pTRV2-FvePHP were transformed into A. tumefaciens GV3101. GV3101-pTRV1, GV3101-pTRV2, and GV3101-pTRV2-FvePHP strains were grown in 5 ml yeast extract peptone (YEP) liquid medium (including 50 μg/ml kanamycin and 100 μg/ml rifampin) at 28 • C for 24 hours at 180 rpm. Then, 1 ml of the bacterial liquid was transferred to 50 ml of YEP liquid medium followed by culture at 28 • C for 6-8 hours at 180 rpm. The bacteria were collected (10 mM MES, 10 mM MgCl 2 , and 200 mM acetosyringone) by centrifugation and the OD 600 value was adjusted to 1.2-1.5 using a spectrophotometer. GV3101-pTRV2 and GV3101-pTRV2-FvePHP were mixed with GV3101-pTRV1 at a 1:1 ratio for 1-2 hours at 28 • C. The bacterial liquid was injected into the leaf through a 1ml syringe without a needle. All plants injected with bacterial liquid were cultured in the dark for 24 hours and then placed in an incubator at 23 • C (12 hours/12 hours dark) until they were transplanted to a greenhouse at Shenyang Agricultural University on 20 August, and were regularly observed multiple times until 25 October.

Overexpression vector construction and plant transformation
The full-length CDS of FvePHP was amplified by RT-PCR from YW and cloned into pRI101-AN to generate the overexpression plasmid p35S::FvePHP. The primers are shown in Supplementary Data Table S1. The plasmid was identified by sequencing. The plasmid was introduced into Agrobacterium strain GV3101 via the freeze-thaw method and transformed into nlef strawberry. The transformation method was as described previously [61]. After 8 months, the transgenic plants were rooted in 1 / 2 MS medium with 0.1 mg/l indole 3-butyric acid (IBA) for 1 month. Then, the transgenic plants were transplanted and grown under greenhouse conditions. The transgenic strawberry plants were examined by PCR and RT-qPCR at DNA and RNA levels, respectively.

cDNA library construction and RNA sequencing
Total RNA was extracted from YW and nlef leaves using TRIzol. RNA integrity was assessed using the RNA Nano 6000 Assay Kit of the Agilent Bioanalyzer 2100 system (Agilent Technologies). Sequencing libraries were generated using the NEB Next ® Ultra™ RNA Library Prep Kit for Illumina ® (NEB, USA). Libraries were sequenced by Wuhan Frasergen Bioinformatics Co. Ltd on an Illumina HiSeq X-Ten platform (Illumina, CA, USA) and 150-bp paired-end reads were generated.

Differential gene expression analysis
Fragments per kilobase per million bases (FPKM) was used to represent the level of gene expression [64]. Differentially expressed genes were analyzed using the R language package DEseq2, and differential genes were identified based on the FDR and fold change (logFc). Here, the screening standard was FDR <.05 and log 2 FC value >1 or <−1.

Statistical analysis
Statistical analysis was performed using DPS 9.01 software. Statistical significance was tested using Student's t-test, and the significance level was set at * P < .05 and * * P < .01.
wrote and modified the manuscript. All authors involved in this study read and approved the manuscript.

Data availability
The BSA sequence data have been deposited in the Genome Warehouse in the National Genomics Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under accession number CRA005241, and are publicly accessible at https://bigd. big.ac.cn/gwh. RNA-seq data were submitted to NCBI and the accession number is PRJNA597925.